Abstract. Smoothed analysis of complexity bounds and condition numbers has 
been done, so far, on a case by case basis. In this paper we consider a reasonably 
large class of condition numbers for problems over the complex numbers and we 
obtain smoothed analysis estimates for elements in this class depending only 
on geometric invariants of the corresponding sets of ill-posed inputs. These 
estimates are for a version of smoothed analysis proposed in this paper which, 
to the best of our knowledge, appears to be new. Several applications to linear 
and polynomial equation solving show that estimates obtained in this way are 
easy to derive and quite accurate. 

1 Introduction 

1.1 Conic condition numbers — Main results 

A distinctive feature of the computations considered in numerical analysis is that 
they are affected by errors. A main character in the understanding of the effects of 
these errors is the condition number of the input at hand. This is a positive number 
which, roughly speaking, quantifies the effects just mentioned when computations 
are performed with infinite precision but the input has been modified by a small 
perturbation. It depends only on the data and the problem at hand. The best 
known condition number is that for matrix inversion and linear equation solving. 
For a square matrix $A$ it takes the form $\kappa(A) = \|A\|\,\|A^{-1}\|$ and was independently 
introduced by Goldstine and von Neumann jSH] and Turing }37j . 

Condition numbers occur in endless instances of round-off analysis. They also 
appear as a parameter in complexity bounds for a variety of iterative algorithms. 
Yet, condition numbers are not easily computable. It has even been conjectured |2j 
that computing the condition number $\mathscr{C}(a)$ of a given datum $a$ is at least as difficult 
as solving the problem for which $a$ is an input. A way out of this situation is to assume 
a probability measure on the set of data and to study the condition number of a 
random datum as a random variable. 

The above ideas have been systematized in a number of places. Notably, Blum [3] 
suggested a complexity theory for numerical algorithms parameterized by a condition 
number $\mathscr{C}(a)$ of the input data (in addition to input size). Then, Smale [30, §1] 
extended this suggestion by proposing to obtain estimates on the probability distribution 
of $\mathscr{C}(a)$. Combining both ideas, he argued, one can give probabilistic bounds 
on the complexity of numerical algorithms. 

Classically, probabilistic analysis of condition numbers takes two forms: bounds 
on the tail of the distribution of $\mathscr{C}(a)$, showing that it is unlikely that $\mathscr{C}(a)$ will 
be large, and bounds on the expected value of $\ln(\mathscr{C}(a))$, estimating the average 
loss of precision and average running time. Examples of such results abound for 
a variety of condition numbers El EEH EE21 123 IHEj • 

Recently D. Spielman and S.-H. Teng §3] suggested a new approach to 
Smale's agenda above. The idea (e.g., for the distribution's tail) is to replace showing 
that 

"it is unlikely that $\mathscr{C}(a)$ will be large" 
by showing that 

"for all $a$ and all slight random perturbations $\Delta a$, it is unlikely that 
$\mathscr{C}(a + \Delta a)$ will be large." 

A survey of this approach, called smoothed analysis, can be found in [SJ- We briefly 
describe its main features in §1.2. 

The goal of this paper is to give bounds for the smoothed analysis (both tail and 
expected value) for a large class of condition numbers for problems over the complex 
numbers. We assume our data space is $\mathbb{C}^{p+1}$, endowed with a Hermitian product 
$\langle\ ,\ \rangle$. We say that $\mathscr{C}$ is a conic condition number if there exists an algebraic cone 
$\Sigma \subset \mathbb{C}^{p+1}$ (the set of ill-posed inputs) such that, for all data $a$, 

$$\mathscr{C}(a) = \frac{\|a\|}{\mathrm{dist}(a,\Sigma)},$$

where $\|\ \|$ and dist are the norm and distance induced by $\langle\ ,\ \rangle$, respectively. 

As defined above, $\kappa(A)$ is not conic since the operator norm $\|\ \|$ is not induced 
by a Hermitian product. Replacing this norm by the Frobenius norm $\|\ \|_F$ yields 
the (commonly considered) version $\kappa_F(A) := \|A\|_F\,\|A^{-1}\|$ of $\kappa(A)$. The Condition 
Number Theorem of Eckart and Young ^3] then states that k(A)f is conic (with £ 
the set of singular matrices). Other examples can be found in [7j, where a certain 
property (related with the so called level-2 condition numbers) is proved for conic 
condition numbers. Furthermore, it is argued in jlUj that for many problems, their 
condition number can be bounded by a conic one. 
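
As a small illustration of the conic form, the following sketch evaluates $\kappa_F(A) = \|A\|_F / \sigma_{\min}(A)$ for complex 2 x 2 matrices, using the fact that the Frobenius distance to the singular matrices is the smallest singular value. The function names and the restriction to the 2 x 2 case (where singular values have a closed form) are assumptions made for this example only.

```python
import math

def frobenius_norm(A):
    """Frobenius norm of a 2x2 matrix given as nested tuples."""
    return math.sqrt(sum(abs(x) ** 2 for row in A for x in row))

def smallest_singular_value(A):
    """Smallest singular value of a complex 2x2 matrix (closed form)."""
    (a, b), (c, d) = A
    T = sum(abs(x) ** 2 for row in A for x in row)  # s1^2 + s2^2
    D = abs(a * d - b * c) ** 2                     # s1^2 * s2^2
    if T == 0:
        return 0.0
    root = math.sqrt(max(T * T - 4.0 * D, 0.0))
    # cancellation-free form of (T - root) / 2
    return math.sqrt(2.0 * D / (T + root))

def kappa_F(A):
    """Conic condition number ||A||_F / dist_F(A, singular matrices)."""
    s = smallest_singular_value(A)
    return math.inf if s == 0.0 else frobenius_norm(A) / s
```

Since the set of singular matrices is a cone, `kappa_F` returns the same value for `A` and for any nonzero complex multiple of `A`, which is what allows the passage to projective space.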

Note that, since $\Sigma$ is a cone, for all $z \in \mathbb{C} \setminus \{0\}$, $\mathscr{C}(a) = \mathscr{C}(za)$. Hence, we may 
restrict to data $a \in \mathbb{P}^p := \mathbb{P}^p(\mathbb{C})$ for which the condition number takes the form 

$$\mathscr{C}(a) = \frac{1}{d_{\mathbb{P}}(a,\Sigma)}, \qquad (1)$$

where, abusing notation, $\Sigma$ is interpreted now as a subset of $\mathbb{P}^p$ and $d_{\mathbb{P}}$ denotes the 
projective distance in $\mathbb{P}^p$ (precise definitions follow in §2.1 below). We will denote by 
$B(a,\sigma)$ the open ball of radius $\sigma$ around $a$ in $\mathbb{P}^p$ with respect to projective distance. 

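In what follows we assume $\Sigma$ is purely dimensional and we write $m = \dim(\Sigma)$. 
Recall that the degree $\deg(\Sigma)$ of $\Sigma$ equals (cf. [13]) 

$$\deg(\Sigma) = \min\{\ell \mid \#(\Sigma \cap \mathbb{P}^{p-m}) \le \ell \text{ for almost all } \mathbb{P}^{p-m} \subseteq \mathbb{P}^p\}.$$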

Our main result is the following. 

Theorem 1.1 Let $\mathscr{C}$ be a conic condition number with set of ill-posed inputs $\Sigma \subseteq 
\mathbb{P}^p$, of pure dimension $m$, $0 < m < p$. Then, for all $a \in \mathbb{P}^p$, all $\sigma \in (0,1]$, and all 
t > S^2l we have 



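$$\mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \{\mathscr{C}(z) \ge t\} \le K(p,m)\,\deg(\Sigma) \left(\frac{1}{t\sigma}\right)^{2(p-m)} \left(1 + \frac{p}{p-m}\,\frac{1}{t\sigma}\right)^{2m}$$

and 

$$\mathop{\mathbf{E}}_{z \in B(a,\sigma)} (\ln \mathscr{C}(z)) \le \frac{1}{2(p-m)} \left(\ln K(p,m) + \ln\deg(\Sigma) + 3\right) + \ln\frac{pm}{p-m} + 2\ln\frac{1}{\sigma},$$

with the constant $K(p,m) := \dfrac{2\,p^{3p}}{m^{3m}(p-m)^{3(p-m)}}$. 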

We will devote §3 to deriving applications of Theorem 1.1 to some condition numbers 
which occur in the literature. 

In most of our applications, the set of ill-posed inputs $\Sigma$ is a hypersurface. That 
is, $\Sigma$ is the zero set $Z(f)$ of a nonzero homogeneous polynomial $f$ and thus $\deg(\Sigma)$ is 
at most the degree of $f$. In this case, we have the following easy-to-apply corollary. 

Corollary 1.2 Let $\mathscr{C}$ be a conic condition number with set of ill-posed inputs 
$\Sigma \subseteq \mathbb{P}^p$. Assume $\Sigma \subseteq Z(f)$ with $f \in \mathbb{C}[X_0,\ldots,X_p]$ homogeneous of degree $d$. Then, 
for all a G W, all a G (0, 1], and all t > p^2, 

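$$\mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \{\mathscr{C}(z) \ge t\} \le 2e^3 p^3 d \left(\frac{1}{t\sigma}\right)^{2} \left(1 + \frac{p}{t\sigma}\right)^{2(p-1)}$$

and 

$$\mathop{\mathbf{E}}_{z \in B(a,\sigma)} (\ln \mathscr{C}(z)) \le \frac{7}{2}\ln p + \frac{1}{2}\ln d + 4 + 2\ln\frac{1}{\sigma}.$$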
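
The two estimates of Corollary 1.2 are explicit enough to be evaluated numerically. The sketch below does this; the function names are hypothetical, and the code does not check the lower bound on $t$ required by the corollary, so the tail value is only meaningful for $t$ in the range where the corollary applies.

```python
import math

def corollary_tail_bound(p, d, sigma, t):
    """Right-hand side of the tail estimate:
    2 e^3 p^3 d (1/(t sigma))^2 (1 + p/(t sigma))^(2(p-1))."""
    x = 1.0 / (t * sigma)
    return 2.0 * math.e ** 3 * p ** 3 * d * x ** 2 * (1.0 + p * x) ** (2 * (p - 1))

def corollary_log_expectation_bound(p, d, sigma):
    """Right-hand side of the expectation estimate:
    (7/2) ln p + (1/2) ln d + 4 + 2 ln(1/sigma)."""
    return 3.5 * math.log(p) + 0.5 * math.log(d) + 4.0 + 2.0 * math.log(1.0 / sigma)
```

For instance, for $n \times n$ matrix inversion one would call `corollary_log_expectation_bound(n * n - 1, n, sigma)`, since the singular matrices form a hypersurface of degree $n$ in a projective space of dimension $n^2 - 1$.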

The main idea towards the proof of Theorem 1.1 is to reformulate the probability 
distribution of a conic condition number as a geometric problem in a complex 
projective space. Indeed, for $V \subseteq \mathbb{P}^p$ we denote by $v(V)$ the volume of $V$, and by 
$V_\varepsilon$ the $\varepsilon$-tube around $V$ in $\mathbb{P}^p$ (precise definitions follow in §2.1 below). With this 
notation, 

$$\mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \left\{\mathscr{C}(z) \ge \frac{1}{\varepsilon}\right\} = \mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \{d_{\mathbb{P}}(z,\Sigma) \le \varepsilon\} = \frac{v(\Sigma_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))}.$$

The first claim in Theorem 1.1 will thus follow from the following purely geometric 
statement. 

Theorem 1.3 Let $V$ be a projective variety in $\mathbb{P}^p$ of pure dimension $m$ with $0 < m < p$. 
Moreover, let a e P p , a € (0, 1], and < e < -^^y 1 - Then we have 

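$$\frac{v(V_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))} \le K(p,m)\,\deg(V) \left(\frac{\varepsilon}{\sigma}\right)^{2(p-m)} \left(1 + \frac{p}{p-m}\,\frac{\varepsilon}{\sigma}\right)^{2m}.$$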

One of the central tools in the derivation of Theorem 1.3 is integral geometry. 
An essential formula of integral geometry [22, §15.2] allows us to relate the volume of 
certain geometric objects to the expected volume of their intersection when they 
are moved at random. A simple application is the equality $v(V) = \deg(V)\,v(\mathbb{P}^m)$ for 
the volume of an irreducible $m$-dimensional subvariety $V \subseteq \mathbb{P}^p$. In order to obtain 
a corresponding bound for $V_\varepsilon \cap B(a,\sigma)$, a more sophisticated use of this equality is 
needed (cf. Lemma 2.2). 



1.2 Relation to previous work 

Let $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. In the study of the behaviour of a function $f\colon \mathbb{K}^n \to \mathbb{R}_+$ (e.g., a 
condition number, a complexity bound) two frameworks have been extensively used: 
worst-case and average-case. Recently, a third framework has been proposed which 
arguably blends the best of the former two. The worst-case framework studies the 
quantity 

$$\sup_{a \in \mathbb{K}^n} f(a) \qquad (2)$$

and the average-case the quantity 

$$\mathop{\mathbf{E}}_{z \in \mathscr{D}} f(z) = \int f(z)\,\varphi(z)\,dz \qquad (3)$$

where $z \in \mathscr{D}$ means that the expected value is taken for a random $z$ whose distribution 
$\mathscr{D}$ has density function $\varphi$. The smoothed analysis of $f$ studies the behaviour 
of 

$$\sup_{a \in \mathbb{K}^n}\ \mathop{\mathbf{E}}_{z \in N^n(0,\sigma^2)} f(a+z) \qquad (4)$$

(possibly for sufficiently small $\sigma$) where $N^n(0,\sigma^2)$ denotes the $n$-dimensional Gaussian 
distribution over $\mathbb{K}$ with mean 0 and variance $\sigma^2$. Note that while (2) and 
(3) usually yield functions on $n$, (4) yields a function on $n$ and $\sigma$. It has been argued 
that smoothed analysis interpolates between worst and average cases since it 
amounts to the first for $\sigma = 0$ and it approaches the second for large $\sigma$. Instances 
of smoothed analysis can be found in jSJ ^1 1^ EH EH1 E2 • 

When $f$ is homogeneous of degree 0 (e.g., a conic condition number) it makes 
sense to restrict $f$ to the projective space $\mathbb{P}^{n-1}(\mathbb{K})$. In this case, it also makes sense 
to replace the distribution $a + N^n(0,\sigma^2)$ by the uniform distribution supported on 
the disk $B(a,\sigma) \subseteq \mathbb{P}^{n-1}$ and consider, instead of (4), the following quantity 

$$\sup_{a \in \mathbb{P}^{n-1}}\ \mathop{\mathbf{E}}_{z \in B(a,\sigma)} f(z). \qquad (5)$$

Note that in this case, the interpolation mentioned above is transparent. When 
$\sigma = 0$ the expected value amounts to $f(a)$ and we obtain worst-case analysis, while 
if $\sigma = 1$ (the diameter of $\mathbb{P}^{n-1}$) the expected value is independent of $a$ and we obtain 
average-case analysis. 

It is this version of smoothed analysis we deal with in this paper. To the best of 
our knowledge it appears here for the first time. Note that while, technically, this 
"uniform smoothed analysis" differs from the Gaussian one considered so far, both 
share the viewpoint described in §1.1 above. 

We have already mentioned the references [SJ 1^ EH ESI H2] as instances of 
previous work in smoothed analysis. In all these cases, an ad hoc argument is used 
to obtain the desired bounds. This is in contrast with the goal of this paper which 
is to provide general estimates which can be applied to a large class of condition 
numbers. We believe the applications in §3 give substance to this goal. 

The idea of reformulating probability distributions as quotients of volumes in 
projective spaces (or spheres) to estimate condition measures goes back at least 
to Smale jUj and Renegar [20]. In particular, Renegar [20] uses this idea to show bounds 
on the probability distribution of a certain random variable in the average-case 
analysis of the complexity of Newton's method. Central to his argument is the fact 
that this random variable can be bounded by a conic condition number. The set of 
ill-posed inputs in [20] is a hypersurface. An extension of these results to the case 
of codimension greater than one was done by Demmel [11] where, in addition, an 
average-case analysis of several conic condition numbers is performed. Our paper is 
an extension of these arguments to the smoothed-analysis framework. 

In a recent paper, Beltran and Pardo obtained estimates similar to those 
proved by Demmel (always for the average-case setting) when the input data a is 
assumed to belong to a complex projective variety V C P p and averages are taken 
for the uniform distribution on $V$. An extension of Theorem 1.1 in this direction is 
certainly doable, but we have not included it in this paper. 

Probably the most important extension of the present paper would be to obtain 
a result akin to Theorem 1.1 (or Corollary 1.2) for problems defined over the real 
numbers. For the average-case setting Demmel [11] states such results. Unfortunately, 
his results directly rely on an unpublished report by Ocneanu dating from 
1985, which apparently contains an upper bound on the volume of tubes around a 
real variety in terms of degrees (cf. Theorem 4.3 in [11]). We are currently working 
towards an extension to the real case. 



2 Proof of Theorem 1.1 

2.1 Distances and volumes in projective space 

We refer to [U Chapter 12] for a more detailed introduction to the concepts needed 
here. A general reference for complex analytic geometry is |16j . 

The complex projective space $\mathbb{P}^p := \mathbb{P}^p(\mathbb{C})$ is defined as the set of one-dimensional 
complex subspaces of $\mathbb{C}^{p+1}$. The space $\mathbb{P}^p$ carries the structure of a compact 
$2p$-dimensional real manifold. A Hermitian inner product $\langle\ ,\ \rangle$ on $\mathbb{C}^{p+1}$ induces a 
Riemannian distance $d_R$ on $\mathbb{P}^p$ (called Fubini-Study distance), which is defined as 

$$d_R(x,y) = \arccos \frac{|\langle \tilde{x}, \tilde{y} \rangle|}{\|\tilde{x}\|\,\|\tilde{y}\|} \quad \text{for } x, y \in \mathbb{P}^p,$$

where $\tilde{x}, \tilde{y}$ are representatives of $x$ and $y$ in $\mathbb{C}^{p+1}$, respectively, and $\|\ \|$ denotes the 
norm induced by $\langle\ ,\ \rangle$. 

The natural projection $\mathbb{R}^{2p+2} \setminus \{0\} = \mathbb{C}^{p+1} \setminus \{0\} \to \mathbb{P}^p$ factors through an 
(everywhere regular) projection $\pi\colon S^{2p+1} \to \mathbb{P}^p$ with fiber $S^1$. It is easy to check that 
the restriction of the derivative $d\pi(x)$ to the orthogonal complement of its kernel 
is orthogonal with respect to the Riemannian metrics on $S^{2p+1}$ and $\mathbb{P}^p$ induced by 
( , ). By means of the Co- Area formula [3J p. 241], this observation allows to reduce 
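the computation of integrals on $\mathbb{P}^p$ to the computation of integrals on $S^{2p+1}$. More 
precisely, for any integrable function $f\colon \mathbb{P}^p \to \mathbb{R}$ and measurable $U \subseteq \mathbb{P}^p$ we have 

$$\int_U f\, d\mathbb{P}^p = \frac{1}{2\pi} \int_{\pi^{-1}(U)} f \circ \pi\; dS^{2p+1}, \qquad (6)$$

where $d\mathbb{P}^p$ and $dS^{2p+1}$ denote the volume forms induced by $\langle\ ,\ \rangle$. 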

For an open subset $U \subseteq M$ of an $m$-dimensional Riemannian manifold, we write 
$v(U) := \int_U dM$ for the $m$-dimensional volume of $U$, where $dM$ is the volume form 
on $M$ induced by the Riemannian metric. In particular, using (6) we get for the 
complex projective space 

$$v(\mathbb{P}^p) = \frac{1}{2\pi}\, v(S^{2p+1}) = \frac{\pi^p}{p!}. \qquad (7)$$

Instead of the Riemannian metric $d_R$ on $\mathbb{P}^p$, we will be working with the associated 
projective metric $d_{\mathbb{P}}$, which is defined as 

$$d_{\mathbb{P}}(x,y) = \sin d_R(x,y).$$

Unless otherwise stated, this is the distance function we will be using throughout 
this paper. The use of this distance function is motivated by our applications. In 
fact, for a conic condition number with ill-posed set $\Sigma \subseteq \mathbb{C}^{p+1}$, $d_{\mathbb{P}}(a,\Sigma)$ (recall 
our abuse of notation in the introduction) just gives the normalized distance of a 
representative of $a$ to $\Sigma$. 
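
The two distances just defined can be computed directly from representatives. The following sketch is an illustration only; the function names are hypothetical and points of projective space are represented by arbitrary nonzero coordinate vectors.

```python
import math

def hermitian(x, y):
    """Standard Hermitian product of two coordinate vectors."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def riemannian_distance(x, y):
    """Fubini-Study distance between the lines spanned by x and y."""
    c = abs(hermitian(x, y)) / math.sqrt(hermitian(x, x).real * hermitian(y, y).real)
    return math.acos(min(c, 1.0))  # clamp guards against rounding above 1

def projective_distance(x, y):
    """Projective distance: the sine of the Fubini-Study distance."""
    return math.sin(riemannian_distance(x, y))
```

Both functions depend only on the lines spanned by `x` and `y`, so multiplying either argument by a nonzero complex scalar leaves the result unchanged.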




Fig. 1 Three distances 



We denote by $B(x,\varepsilon) = B_{\mathbb{P}^p}(x,\varepsilon)$ the open ball of radius $\varepsilon$ around $x$ in $\mathbb{P}^p$ (with 
respect to $d_{\mathbb{P}}$), and by $S^p(x,\varepsilon)$ the sphere of radius $\varepsilon$ around $x$. For a subset $V \subseteq \mathbb{P}^p$ 
we define the $\varepsilon$-tube around $V$ in $\mathbb{P}^p$ to be the open set 

$$V_\varepsilon := \{x \in \mathbb{P}^p \mid d_{\mathbb{P}}(x,V) < \varepsilon\}.$$

We will also use the notation $v_\varepsilon(V) := v(V_\varepsilon)$ for the volume of an $\varepsilon$-tube in $\mathbb{P}^p$ 
around a subset $V \subseteq \mathbb{P}^p$. If we wish to stress the ambient space in which the tube 
is considered, we will write $v_\varepsilon^{\mathbb{P}^p}(V)$ instead. We will similarly do so if the ambient 
space is a sphere. 

For a purely $m$-dimensional subvariety $V \subseteq \mathbb{P}^p$, the set $V \setminus \mathrm{Sing}(V)$ (where 
$\mathrm{Sing}(V)$ denotes the singular locus of $V$) is a real $2m$-dimensional Riemannian 
manifold (with the metric induced from $\mathbb{P}^p$), and we define the volume of $V$ as 
$v(V) := v(V \setminus \mathrm{Sing}(V))$. This coincides with any other reasonable notion of volume. 

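Lemma 2.1 Let $\mathbb{P}^{p-m} \subseteq \mathbb{P}^p$ and let $0 < \varepsilon \le 1$. Then 

$$v_\varepsilon(\mathbb{P}^{p-m}) \le v(\mathbb{P}^{p-m})\, v(\mathbb{P}^m)\, \varepsilon^{2m},$$

with equality if and only if $p - m = 0$. In particular, for the volume of a ball of 
radius $\varepsilon$ around $x \in \mathbb{P}^p$ we have 

$$v(B_{\mathbb{P}^p}(x,\varepsilon)) = v(\mathbb{P}^p)\, \varepsilon^{2p}.$$

Proof. A ball of radius $\varepsilon$ in $\mathbb{P}^p$ with respect to $d_{\mathbb{P}}$ corresponds to a ball in $\mathbb{P}^p$ of 
radius $\delta = \arcsin(\varepsilon)$ with respect to $d_R$. From Equation (6) we get the identity 

$$v_\varepsilon^{\mathbb{P}^p}(\mathbb{P}^{p-m}) = \frac{1}{2\pi}\, v_\delta^{S^{2p+1}}\!\left(S^{2(p-m)+1}\right). \qquad (8)$$

Recall that on the sphere we use the usual Riemannian metric induced from the 
ambient space. We have thus reduced our problem to that of computing the volume 
of a tube around a subsphere of a sphere. Expressions for this volume are 
straightforward to calculate: for a sphere $S^m \subseteq S^p$ we have 

$$v_\delta^{S^p}(S^m) = v(S^m)\, v(S^{p-m-1}) \int_0^\delta \cos(t)^m \sin(t)^{p-m-1}\, dt.$$

Plugging this into Equation (8) we get 

$$v_\varepsilon^{\mathbb{P}^p}(\mathbb{P}^{p-m}) = \frac{1}{2\pi}\, v(S^{2(p-m)+1})\, v(S^{2m-1}) \int_0^\delta \cos(t)^{2(p-m)+1} \sin(t)^{2m-1}\, dt$$
$$= 2\pi\, v(\mathbb{P}^{p-m})\, v(\mathbb{P}^{m-1}) \int_0^\delta \cos(t)^{2(p-m)+1} \sin(t)^{2m-1}\, dt$$
$$= 2\pi\, v(\mathbb{P}^{p-m})\, v(\mathbb{P}^{m-1}) \int_0^\varepsilon (1-u^2)^{p-m}\, u^{2m-1}\, du,$$

where in the last step we used the substitution $u = \sin(t)$. For $0 \le u \le 1$ we have 
$(1-u^2)^{p-m} \le 1$, with equality if and only if $p - m = 0$. Substituting this bound in 
the above equation and evaluating the integral, we get 

$$v_\varepsilon^{\mathbb{P}^p}(\mathbb{P}^{p-m}) \le 2\pi\, v(\mathbb{P}^{p-m})\, v(\mathbb{P}^{m-1})\, \frac{\varepsilon^{2m}}{2m} = v(\mathbb{P}^{p-m})\, v(\mathbb{P}^m)\, \varepsilon^{2m},$$

where we used the fact that $v(\mathbb{P}^m) = v(\mathbb{P}^{m-1})\,\pi/m$ for the last equality. □ 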
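
The ball-volume identity of Lemma 2.1 says that a uniformly distributed point of $\mathbb{P}^p$ lies within projective distance $\varepsilon$ of a fixed point with probability $\varepsilon^{2p}$. The sketch below checks this by simulation. It relies on the standard fact (not stated in the text, hence an assumption of this example) that normalizing a standard complex Gaussian vector in $\mathbb{C}^{p+1}$ gives the uniform distribution on $\mathbb{P}^p$; all function names are hypothetical.

```python
import math
import random

def random_point(p, rng):
    """Representative of a uniform random point of P^p (complex Gaussian vector)."""
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(p + 1)]

def projective_distance(x, y):
    """sqrt(1 - |<x,y>|^2 / (|x|^2 |y|^2)), the sine of the Fubini-Study angle."""
    inner = sum(a * b.conjugate() for a, b in zip(x, y))
    nx = sum(abs(a) ** 2 for a in x)
    ny = sum(abs(b) ** 2 for b in y)
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2 / (nx * ny)))

def ball_fraction(p, eps, samples=20000, seed=0):
    """Monte Carlo estimate of v(B(x, eps)) / v(P^p) for a fixed center x."""
    rng = random.Random(seed)
    center = [1.0] + [0.0] * p
    hits = sum(projective_distance(random_point(p, rng), center) < eps
               for _ in range(samples))
    return hits / samples
```

Comparing `ball_fraction(p, eps)` with `eps ** (2 * p)` for a few values of `p` and `eps` gives a quick sanity check of the formula, up to sampling error.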



2.2 A fact from integral geometry 

We will repeatedly use a variation of a classical formula from integral geometry. 
Let $M, N \subseteq \mathbb{P}^p$ be submanifolds of (real) dimension $2m$ and $2n$, respectively. The 
unitary group $G := U(p+1)$ acts transitively on $\mathbb{P}^p$ in a straightforward way. A 
key result in integral geometry states that the expected volume of the intersection 
of $M$ with a random translate $gN$ of $N$ satisfies 

$$\frac{\mathbf{E}_{g \in G}\, v(M \cap gN)}{v(\mathbb{P}^{m+n-p})} = \frac{v(M)}{v(\mathbb{P}^m)} \cdot \frac{v(N)}{v(\mathbb{P}^n)}. \qquad (9)$$

Hereby the expectation is taken with respect to the normalized Haar measure on $G$. 
The above equality also holds if $M$ and $N$ are (possibly singular) subvarieties of $\mathbb{P}^p$. 
Equation (9) is easily derived, using (6), from the corresponding statement in [22, 
§15.2] for spheres. 

2.3 Estimating the volume of patches of projective varieties 

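The following lemma allows us to estimate the volume of the intersection of a projective 
variety $V$ with a ball in terms of the degree of $V$ and the radius of the ball. 

Lemma 2.2 Let $V \subseteq \mathbb{P}^p$ be an irreducible $m$-dimensional projective variety, $a \in \mathbb{P}^p$, 
$0 < \varepsilon \le 1$ and $V' = V \cap B_{\mathbb{P}^p}(a,\varepsilon)$. Then 

$$\frac{v(V')}{v(\mathbb{P}^m)} \le \deg(V) \binom{p}{m} \varepsilon^{2m}.$$

Proof. Taking $M = \mathbb{P}^{p-m}$ and $N = V'$ in (9) we obtain 

$$\frac{v(V')}{v(\mathbb{P}^m)} = \mathbf{E}\left(\#(gV' \cap \mathbb{P}^{p-m})\right),$$

where the expectation is over all $g$ in the unitary group $U_{p+1}$ taken w.r.t. the normalized 
Haar measure (so that $U_{p+1}$ has volume 1). Since $\#(gV' \cap \mathbb{P}^{p-m}) \le \#(gV \cap \mathbb{P}^{p-m}) \le 
\deg(V)$ for almost all $g \in U_{p+1}$ we obtain 

$$\mathbf{E}\left(\#(gV' \cap \mathbb{P}^{p-m})\right) \le \deg(V)\, \mathrm{Prob}\{gV' \cap \mathbb{P}^{p-m} \ne \emptyset\}.$$

Since $V' \subseteq B(a,\varepsilon)$ we have 

$$\mathrm{Prob}\{gV' \cap \mathbb{P}^{p-m} \ne \emptyset\} \le \mathrm{Prob}\{gB(a,\varepsilon) \cap \mathbb{P}^{p-m} \ne \emptyset\} = \frac{v_\varepsilon(\mathbb{P}^{p-m})}{v(\mathbb{P}^p)}.$$

The statement now follows from Lemma 2.1 using that $v(\mathbb{P}^p) = \pi^p/p!$. □ 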

The following crucial lemma is the only step in our chain of argumentation that 
fails to be true over R. 

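Lemma 2.3 [1, Theorem 22] Let $V \subseteq \mathbb{P}^p$ be an irreducible projective variety of 
dimension $m \ge 1$, $y \in V$ and $0 < \varepsilon \le 1/\sqrt{2}$. Then we have 

$$v(V \cap B_{\mathbb{P}^p}(y,\varepsilon)) \ge \frac{1}{2}\, v(\mathbb{P}^m)\, \varepsilon^{2m}. \qquad □$$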



2.4 Bounding the expectation 

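The next result gives a convenient way to bound the expectation of a nonnegative 
random variable whose tail probabilities can be estimated by some power law. 

Proposition 2.4 Let $X$ be a nonnegative, absolutely continuous, random variable 
and $\alpha, t_0, K$ be positive constants satisfying $\mathrm{Prob}\{X \ge t\} \le K t^{-\alpha}$ for all $t \ge t_0$. 
Then we have 

$$\mathbf{E}(\ln X) \le \ln t_0 + \frac{1}{\alpha}(\ln K + 1).$$

Moreover, if $t_0 \le K^{1/\alpha}$ then $\mathbf{E}(\ln X) \le \frac{1}{\alpha}(\ln K + 1)$. 

Proof. Define the monotonically decreasing function $g\colon (0,1) \to \mathbb{R}$ by 

$$g(y) = \begin{cases} -\frac{1}{\alpha}\ln(y/K) & \text{if } y < K t_0^{-\alpha}, \\ \ln t_0 & \text{otherwise.} \end{cases}$$

We claim that $\mathrm{Prob}\{\ln X \ge g(y)\} \le y$ for all $y \in (0,1)$. Indeed, if $y < K t_0^{-\alpha}$ then 
there exists $t > t_0$ such that $y = K t^{-\alpha}$. Therefore, 

$$g(y) = -\frac{1}{\alpha}\ln(y/K) = \ln t$$

and 

$$\mathrm{Prob}\{\ln X \ge g(y)\} = \mathrm{Prob}\{\ln X \ge \ln t\} = \mathrm{Prob}\{X \ge t\} \le K t^{-\alpha} = y.$$

If, instead, $y \ge K t_0^{-\alpha}$ then 

$$\mathrm{Prob}\{\ln X \ge g(y)\} = \mathrm{Prob}\{\ln X \ge \ln t_0\} = \mathrm{Prob}\{X \ge t_0\} \le K t_0^{-\alpha} \le y.$$

Using [U Prop. 2, Ch. 11] it follows that 

$$\mathbf{E}(\ln X) \le \int_0^1 g(y)\, dy = -\int_0^{K t_0^{-\alpha}} \frac{1}{\alpha}\ln(y/K)\, dy + \int_{K t_0^{-\alpha}}^1 \ln t_0\, dy$$
$$\le -\int_0^1 \frac{1}{\alpha}\ln(y/K)\, dy + \int_0^1 \ln t_0\, dy = -\frac{1}{\alpha}\Big[\, y(\ln y - 1) \Big]_0^1 + \frac{1}{\alpha}\ln K + \ln t_0 = \frac{1}{\alpha}(1 + \ln K) + \ln t_0.$$

If $t_0 \le K^{1/\alpha}$ then $K t_0^{-\alpha} \ge 1$ and the integral above has only its first term. □ 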
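
To see how Proposition 2.4 behaves on a concrete distribution, the following sketch compares the bound with a simulated value of $\mathbf{E}(\ln X)$. The Pareto variable with $\mathrm{Prob}\{X \ge t\} = (s/t)^\alpha$ for $t \ge s$ is a choice made for this example only; it satisfies the hypothesis with $K = s^\alpha$ and any $t_0 \le s$, and for it $\mathbf{E}(\ln X) = \ln s + 1/\alpha$, so the second bound of the proposition is attained. Function names are hypothetical.

```python
import math
import random

def log_expectation_bound(alpha, t0, K):
    """Bound on E(ln X) given Prob{X >= t} <= K t^(-alpha) for all t >= t0."""
    bound = (math.log(K) + 1.0) / alpha
    if t0 > K ** (1.0 / alpha):
        bound += math.log(t0)  # general case of the proposition
    return bound

def sample_mean_log_pareto(alpha, scale, samples=20000, seed=1):
    """Sample mean of ln X for a Pareto variable with the given exponent and scale."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        total += math.log(scale) - math.log(u) / alpha  # ln of scale * u^(-1/alpha)
    return total / samples
```

With `alpha = 2`, `scale = 3`, `K = 9` and `t0 = 1`, the simulated mean and the bound should agree up to sampling error.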
2.5 Proof of main results 



Proof of Theorem 1.3. It is sufficient to prove the assertion for an irreducible $V$. In 
order to see this recall that $\deg(V) = \deg(V_1) + \cdots + \deg(V_q)$, where $V_1,\ldots,V_q$ are 
the irreducible components of $V$ which we assume to be all of the same dimension. 

So we assume that V is irreducible. We follow the arguments in ^ Proof of 
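Theorem 16]. Fix $\varepsilon_1 \in (0,1]$ such that $0 < \varepsilon_1 - \varepsilon \le 1/\sqrt{2}$ (we will specify $\varepsilon_1$ later). 
For each $z \in V_\varepsilon$ there exists $y \in V$ such that $d_{\mathbb{P}}(z,y) \le \varepsilon$ and hence $B(y,\varepsilon_1-\varepsilon) \subseteq 
B(z,\varepsilon_1)$. 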




Fig. 2 The thick curve segment is $V \cap B(y, \varepsilon_1 - \varepsilon)$ 



Since $\varepsilon_1 - \varepsilon \le 1/\sqrt{2}$ we may use Lemma 2.3 to obtain 

$$v(V \cap B(z,\varepsilon_1)) \ge v(V \cap B(y,\varepsilon_1-\varepsilon)) \ge \frac{1}{2}\, v(\mathbb{P}^m)(\varepsilon_1-\varepsilon)^{2m}. \qquad (10)$$

In order to estimate $v(V_\varepsilon \cap B(a,\sigma))$ we put $V' := V \cap B(a,\sigma+\varepsilon_1)$ and note that 
$V \cap B(z,\varepsilon_1) \subseteq V'$ for all $z \in V_\varepsilon \cap B(a,\sigma)$. 

Fig. 3 The thick curve segment is $V' = V \cap B(a, \sigma + \varepsilon_1)$ 
and the shaded region is $V_\varepsilon \cap B(a,\sigma)$ 



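Using (10) we have 

$$\frac{v(V_\varepsilon \cap B(a,\sigma))}{v(\mathbb{P}^p)} = \frac{1}{v(\mathbb{P}^p)} \int_{z \in V_\varepsilon \cap B(a,\sigma)} 1\, dz \le \frac{1}{v(\mathbb{P}^p)} \int_{z \in V_\varepsilon \cap B(a,\sigma)} \frac{2\, v(V \cap B(z,\varepsilon_1))}{v(\mathbb{P}^m)(\varepsilon_1-\varepsilon)^{2m}}\, dz$$
$$\le \frac{2}{v(\mathbb{P}^m)(\varepsilon_1-\varepsilon)^{2m}} \cdot \frac{1}{v(\mathbb{P}^p)} \int_{z \in \mathbb{P}^p} v(V' \cap B(z,\varepsilon_1))\, dz.$$

In addition, 

$$\frac{1}{v(\mathbb{P}^p)} \int_{z \in \mathbb{P}^p} v(V' \cap B(z,\varepsilon_1))\, dz = \mathop{\mathbf{E}}_{g \in G}\, v(V' \cap gB(z_0,\varepsilon_1)) = v(\mathbb{P}^m)\, \frac{v(V')}{v(\mathbb{P}^m)}\, \frac{v(B(z_0,\varepsilon_1))}{v(\mathbb{P}^p)},$$

where $z_0$ is any point in $\mathbb{P}^p$ and the second equality follows from (9). Using 
Lemma 2.1 we conclude that 

$$\frac{1}{v(\mathbb{P}^p)} \int_{z \in \mathbb{P}^p} v(V' \cap B(z,\varepsilon_1))\, dz = v(V')\, \varepsilon_1^{2p}.$$

On the other hand, by Lemma 2.2 we have 

$$\frac{v(V')}{v(\mathbb{P}^m)} \le \deg(V) \binom{p}{m} (\sigma+\varepsilon_1)^{2m},$$

since $V' = V \cap B(a,\sigma+\varepsilon_1)$. Combining all the above we get the estimate 

$$\frac{v(V_\varepsilon \cap B(a,\sigma))}{v(\mathbb{P}^p)} \le \frac{2}{(\varepsilon_1-\varepsilon)^{2m}}\, \frac{v(V')}{v(\mathbb{P}^m)}\, \varepsilon_1^{2p} \le \frac{2\,\varepsilon_1^{2p}}{(\varepsilon_1-\varepsilon)^{2m}}\, \deg(V) \binom{p}{m} (\sigma+\varepsilon_1)^{2m}.$$

Using again $v(B(a,\sigma)) = v(\mathbb{P}^p)\,\sigma^{2p}$ it follows that 

$$\frac{v(V_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))} \le \frac{2\,\varepsilon_1^{2p}}{(\varepsilon_1-\varepsilon)^{2m}} \left(\frac{1}{\sigma}\right)^{2(p-m)} \left(1 + \frac{\varepsilon_1}{\sigma}\right)^{2m} \deg(V) \binom{p}{m}.$$

We finally choose $\varepsilon_1 := \frac{p}{p-m}\,\varepsilon$. Note that then 

$$\varepsilon_1 - \varepsilon = \frac{m}{p-m}\,\varepsilon \le \frac{1}{\sqrt{2}}$$

as we needed, the inequality holding since $\varepsilon \le \frac{p-m}{\sqrt{2}\,m}$. We obtain 

$$\frac{v(V_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))} \le \frac{2\,p^{2p}}{m^{2m}(p-m)^{2(p-m)}} \binom{p}{m} \deg(V) \left(\frac{\varepsilon}{\sigma}\right)^{2(p-m)} \left(1 + \frac{p}{p-m}\,\frac{\varepsilon}{\sigma}\right)^{2m}.$$

Taking into account the estimate $\binom{p}{m} \le \frac{p^p}{m^m (p-m)^{p-m}}$ (which readily follows from 
the binomial expansion of $p^p = (m + (p-m))^p$) we finish the proof. □ 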

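Proof of Theorem 1.1. The inequality for the tail follows directly from Theorem 1.3. 
For the expectation estimate, let $\varepsilon_0 := \frac{(p-m)\,\sigma}{pm}$ and $t_0 := \varepsilon_0^{-1}$. Note that, for $\varepsilon \le \varepsilon_0$, 

$$\left(1 + \frac{p}{p-m}\,\frac{\varepsilon}{\sigma}\right)^{2m} \le \left(1 + \frac{1}{m}\right)^{2m} \le e^2$$

and thus 

$$\frac{v(\Sigma_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))} \le K(p,m)\,\deg(\Sigma) \left(\frac{\varepsilon}{\sigma}\right)^{2(p-m)} e^2.$$

Therefore, for all $t \ge t_0$, writing $\varepsilon = 1/t$, 

$$\mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \{\mathscr{C}(z) \ge t\} = \mathop{\mathrm{Prob}}_{z \in B(a,\sigma)} \{d_{\mathbb{P}}(z,\Sigma) \le \varepsilon\} = \frac{v(\Sigma_\varepsilon \cap B(a,\sigma))}{v(B(a,\sigma))} \le K(p,m)\,\deg(\Sigma) \left(\frac{1}{\sigma}\right)^{2(p-m)} e^2\, t^{-2(p-m)}.$$

A straightforward application of Proposition 2.4 yields 

$$\mathop{\mathbf{E}}_{z \in B(a,\sigma)} (\ln \mathscr{C}(z)) \le \frac{1}{2(p-m)} \left(\ln K(p,m) + \ln\deg(\Sigma) + 3\right) + \ln\frac{pm}{p-m} + 2\ln\frac{1}{\sigma}. \qquad □$$

Proof of Corollary 1.2. Put $\Sigma' = Z(f)$ and note that 

$$\mathscr{C}(a) = \frac{1}{d_{\mathbb{P}}(a,\Sigma)} \le \frac{1}{d_{\mathbb{P}}(a,\Sigma')}.$$

The assertion follows from Theorem 1.1 applied to $\Sigma'$ and the inequality 

$$K(p,p-1) = \frac{2\,p^{3p}}{(p-1)^{3(p-1)}} = 2\left(1 + \frac{1}{p-1}\right)^{3(p-1)} p^3 \le 2e^3 p^3. \qquad □$$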



3 Some Applications 

In this section we obtain smoothed analysis estimates for the condition numbers of 
four problems: linear equation solving, Moore-Penrose inversion, eigenvalue 
computations, and polynomial equation solving. For the first two, instances of such 
analysis already exist and we therefore compare our results with those in the 
literature. The following differences, however, should be noted. Firstly, these analyses 
were done for problems over the reals. Secondly, they hold within the Gaussian 
framework for smoothed analysis described in §1.2. The first feature is not 
important since a cursory look at the referred proofs shows that similar results hold for 
complex matrices. One should, though, keep in mind the second. 



3.1 Linear equation solving 

The first natural application of our result is for the classical condition number k(A). 
In M. Wschebor showed (solving a conjecture posed in that, for all n x n 
real matrices M with ||M|| < 1, all < a < 1 and all t > 

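$$\mathop{\mathrm{Prob}}_{E \in N^{n^2}(0,\sigma^2)} \left(\kappa(M+E) \ge t\right) \le \frac{Kn}{\sigma t}$$

with $K$ a universal constant. Note that, by Proposition 2.4 this implies 

$$\mathop{\mathbf{E}}_{E \in N^{n^2}(0,\sigma^2)} (\ln \kappa(M+E)) \le \ln n + \ln\frac{1}{\sigma} + \ln K + 1.$$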

We next compare Wschebor's result with what can be obtained from Corollary 1.2. 
To do so, we first note that, for $A \in \mathbb{C}^{n \times n}$, 

$$\kappa(A) = \|A\|\,\|A^{-1}\| \le \|A\|_F\,\|A^{-1}\| =: \kappa_F(A)$$
and that, by the Condition Number Theorem of Eckart and Young ( see also HI 
Theorem 1, Chapter 11]), $\|A^{-1}\| = d_F(A,\Sigma)^{-1}$. Here $\|\ \|_F$ and $d_F$ are the Frobenius 
norm and distance in $\mathbb{C}^{n \times n}$ which are induced by the Hermitian product $(A,B) \mapsto 
\mathrm{trace}(AB^*)$. It follows that $\kappa_F(A)$ is conic. We can thus give upper bounds for 
$\kappa_F(A)$ and they will hold as well for $\kappa(A)$. 

Proposition 3.1 For all $n \ge 1$, $0 < \sigma \le 1$, and $M \in \mathbb{C}^{n \times n}$ we have 

$$\mathop{\mathbf{E}}_{A \in B(M,\sigma)} (\ln \kappa_F(A)) \le \frac{15}{2}\ln n + 2\ln\left(\frac{1}{\sigma}\right) + 4,$$

where the expectation is over all $A$ uniformly distributed in the disk of radius $\sigma$ 
centered at $M$ in projective space $\mathbb{P}^{n^2-1}$ (recall that we always use the projective 
and not the Riemannian distance). 

Proof. The variety $\Sigma$ of singular matrices is a hypersurface in $\mathbb{P}^{n^2-1}$ of degree $n$. 
We now apply Corollary 1.2. □ 

Note that the bound in Proposition 3.1 is of the same order of magnitude as 
Wschebor's, worse by just a constant factor. On the other hand, its derivation 
from Corollary 1.2 is rather immediate. We next extend this bound to rectangular 
matrices. 



3.2 Moore-Penrose inversion 

Let $\ell \ge n$ and consider the space $\mathbb{C}^{\ell \times n}$ of $\ell \times n$ rectangular matrices. Denote by 
$\Sigma \subset \mathbb{C}^{\ell \times n}$ the subset of rank-deficient matrices. Let $A \notin \Sigma$ and let $A^\dagger$ denote 
its Moore-Penrose inverse (see, e.g., jH |3])- The condition number of A (for the 
computation of $A^\dagger$) is defined as 

$$\mathrm{cond}^\dagger(A) = \lim_{\varepsilon \to 0}\ \sup_{\|\Delta A\|_2 \le \varepsilon} \frac{\|(A+\Delta A)^\dagger - A^\dagger\|_2\, \|A\|_2}{\|A^\dagger\|_2\, \|\Delta A\|_2}.$$

This is not a conic condition number but it happens to be close to one. One defines 
k)(A) = \\A\\ 2 \\A^\\2 and, since \\A^\\ 2 = dist(AE)- 1 US], we obtain 



dist(A^) 
In addition (see |S1 §HL3]), 

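$$\kappa^\dagger(A) \le \mathrm{cond}^\dagger(A) \le \frac{1+\sqrt{5}}{2}\, \kappa^\dagger(A).$$

Thus, $\ln(\mathrm{cond}^\dagger(A))$ differs from $\ln(\kappa^\dagger(A))$ just by a small additive constant. As for 
square matrices, $\kappa^\dagger(A)$ is not conic since the operator norm is not induced by a 
Hermitian product in $\mathbb{C}^{\ell \times n}$. But, again, we can bound $\kappa^\dagger(A)$ by the conic condition 
number $\kappa_F^\dagger(A) := \|A\|_F\, \|A^\dagger\|$. 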

A smoothed analysis for k^(A) was performed in jSj. Computer experiments 
reported in that paper, however, suggest that the exhibited bounds, while sharp 
when $\ell$ is close to $n$, are not so for more elongated matrices. Actually, an empirical 
average $\mathrm{Avr}(\ln \kappa^\dagger(A))$ was computed for several pairs $(n,\ell)$ and matrices of the 
form $A = M + \Delta$ with $M$ a fixed ill-posed matrix and $\Delta$ a small perturbation. It 
was then mentioned [HI §7] that "one sees that when one fixes n and lets £ increase 
the quantity $\mathrm{Avr}(\ln \kappa^\dagger(A))$ decreases. This is in contrast with the behaviour of [our 
bound]. It appears that our methods are not sharp enough to capture the behaviour 
of $\mathbf{E}(\ln \kappa^\dagger(A))$." As we next see, the bounds following from Theorem 1.1 capture 
this behaviour much better. 

The bound shown in [S] is of the form 

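$$\sup_{A \in \mathbb{R}^{\ell \times n}}\ \mathop{\mathbf{E}}_{E \in N^{\ell n}(0,\sigma^2)} (\ln \kappa^\dagger(A+E)) \le O\left(\ln \ell + \ln\frac{1}{\sigma}\right). \qquad (11)$$

It depends on $\ell$ and tends to $\infty$ when $\ell$ does so. Our next result shows that for 
large $\ell$, the expected value above (now with respect to uniform perturbations) is 
bounded by an expression depending only on $n$ and $\sigma$. 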

Proposition 3.2 For all n > 1 and < a < 1 we have 
limsup sup E (InKpiA)) < (n+ | J ln(n) +nln2 + 2 + (n + l)ln-. 

Proof. It is well known that (the image in $\mathbb{P}^{\ell n-1}(\mathbb{C})$ of) $\Sigma$ is a projective variety 
of codimension $\ell - n + 1$ and degree $\binom{\ell}{n-1}$ (see [13, Examples 12.1 and 19.10]). By 
Theorem HU for all M G F and t > t = 1 



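$$\mathop{\mathrm{Prob}}_{A \in B(M,\sigma)} \{\kappa_F^\dagger(A) \ge t\} \le K(p,m)\,\deg(\Sigma) \left(\frac{1}{t\sigma}\right)^{2(p-m)} \left(1 + \frac{p}{p-m}\,\frac{1}{t\sigma}\right)^{2m}$$
$$\le K(p,m)\,\deg(\Sigma) \left(\frac{1}{t\sigma}\right)^{2(p-m)} \left(\frac{2p}{\sigma(p-m)}\right)^{2m}$$

with 

$$p = \ell n - 1, \quad m = \ell n - \ell + n - 2, \quad \text{and} \quad \deg(\Sigma) = \binom{\ell}{n-1}.$$

Therefore, by Proposition 2.4 

$$\mathop{\mathbf{E}}_{A \in B(M,\sigma)} (\ln \kappa_F^\dagger(A)) \le \frac{1}{2(p-m)} \left(\ln\left(K(p,m)\,\deg(\Sigma) \left(\frac{2p}{\sigma(p-m)}\right)^{2m}\right) + 1\right) + \ln\frac{1}{\sigma}.$$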



We next bound the logarithms of the expressions inside the parenthesis. 

To bound the binomial coefficients we use the following estimates (see |381 
(1.4.5)]) 

■n(")<M ^ < yH g 
where $H$ denotes the binomial entropy function defined by $H(z) = -z\ln z - (1-z)\ln(1-z)$ 
for $z \in (0,1)$. Note that $H$ is monotonically increasing on $(0,\frac{1}{2})$ and 
$H(z) = H(1-z)$ for $z \in (0,1)$. 

It will be convenient to use the asymptotic notations $f(n,\ell) \sim g(n,\ell)$ and 
$f(n,\ell) \lesssim g(n,\ell)$ to express that $\lim_{\ell \to \infty} \frac{f(n,\ell)}{g(n,\ell)} = 1$ and $\limsup_{\ell \to \infty} \frac{f(n,\ell)}{g(n,\ell)} \le 1$, respectively. 

We obtain 



p H f=) = PH(^) < , h(^±1) ~ ft, H(i) < <( 1 + in 



p / \ p J \ in — 1 / \n 

using 

/ 1 \ 1, n— 1, n 1, 

H - = -InnH In < -(1 + lnn). 

\n/ n n n — 1 n 

Hence lnif(p, m) — 2>i{l + Inn). Similarly, 

lndeg(E)=ln( M < < rein -. 



n 



,u- iy - v^y ~ re 

Finally, 



Therefore, 



2m In ( — — — - J ~ 2&i(lnn + ln~). 
\cr(p — m) J a 



sup E (lnK^(A)) 

< ^ ^3£(1 + lnre) +nln ^-^ + 2in (lnn + ln-X\ + ln- 

< + lnre+ (re + 1) In - + nln2 + 2, 

which shows the claim. □ 



Remark 3.3 The bound in Proposition 3.2 is independent of $\ell$. Yet, its dependence 
on $n$ is linear and the term $\ln\frac{1}{\sigma}$ is multiplied by a factor $n$. This is too large 
a bound. We now note that bounds such as (11) also follow from our results. For 
a very short derivation, note that if a matrix $A$ is rank deficient then $\det(\bar{A}) = 0$, 
where $\bar{A}$ is the $n \times n$ matrix obtained by removing all rows of $A$ with index greater 
than $n$. Therefore $\Sigma \subseteq \tilde{\Sigma} = \{A \in \mathbb{C}^{\ell \times n} \mid \det(\bar{A}) = 0\}$. This implies that, if 

= i 



£ is a hypersurface of 

yields 



„-l(A,£) 

Since $\tilde{\Sigma}$ is a hypersurface of degree $n$, an immediate application of Corollary 1.2 
yields 

$$\sup_{M \in \mathbb{P}^{\ell n-1}}\ \mathop{\mathbf{E}}_{A \in B(M,\sigma)} (\ln \kappa_F^\dagger(A)) \le \frac{7}{2}\ln \ell + 4\ln n + 4 + 2\ln\left(\frac{1}{\sigma}\right).$$

For small $\ell$ (say, polynomially bounded in $n$) this last bound is better than that in 
Proposition 3.2. We conjecture that an asymptotic bound of the form $O(\ln(n) + 
\ln(1/\sigma))$ holds. 



3.3 Eigenvalue computations 

Let $A \in \mathbb{C}^{n \times n}$ and $\lambda \in \mathbb{C}$ be a simple eigenvalue of $A$. For any sufficiently small 
perturbation $\Delta A$ there exists a unique eigenvalue $\tilde{\lambda}$ of $A + \Delta A$ close to $\lambda$. It is 
known [18] that 

$$|\tilde{\lambda} - \lambda| \le \|P\|\,\|\Delta A\| + O(\|\Delta A\|^2) \qquad (12)$$

where $P \in \mathbb{C}^{n \times n}$ is the projection matrix given by 

$$P = (y^H x)^{-1}\, x\, y^H.$$

Here $x$ and $y$ are right and left eigenvectors associated to $\lambda$, respectively (i.e., 
satisfying $Ax = \lambda x$ and $y^H A = \lambda y^H$), and $y^H$ is the transpose conjugate of $y$. Note 
that $y^H x$ is a scalar. Furthermore, inequality (12) is sharp in the sense that the 
factor $\|P\|$ cannot be decreased. We can then define 

$$\kappa(A,\lambda) := \begin{cases} \|P\| & \text{if } \lambda \text{ is simple,} \\ \infty & \text{otherwise,} \end{cases}$$

and the (absolute) condition number of $A$ for eigenvalue computations 

$$\kappa_{\mathrm{eigen}}(A) := \max_{\lambda}\, \kappa(A,\lambda),$$

where the maximum is over all the eigenvalues $\lambda$ of $A$. Note that $\kappa_{\mathrm{eigen}}(A)$ is 
homogeneous of degree 0 in $A$. Also, the set $\Sigma$ where $\kappa_{\mathrm{eigen}}$ is infinite is the set of 
matrices having multiple eigenvalues. Finally, Wilkinson proved that 

$$\kappa_{\mathrm{eigen}}(A) \le \frac{\sqrt{2}\,\|A\|_F}{\mathrm{dist}(A,\Sigma)}. \qquad (13)$$

Demmel used the fact that the right-hand side of (13) is conic to obtain 
bounds on the tail of $\kappa_{\mathrm{eigen}}(A)$ for random $A$. We next use it to obtain smoothed 
analysis estimates. 

Proposition 3.4 For alln>l and M G C nxn , 

E (ln/CeigenM-)) < 8 Inn + 2 In- + 5. 

AeB(M,a) ° O 

Proof. Let $\chi_A$ be the characteristic polynomial of $A$. This is a monic polynomial of
degree $n$ whose coefficient of degree $i$ is a homogeneous polynomial of degree $n - i$ in
the entries of $A$. Clearly, $A$ has multiple eigenvalues if and only if $\chi_A$ has multiple
roots. This happens if and only if the discriminant $\mathrm{disc}(\chi_A)$ of $A$ is zero. The
discriminant $\mathrm{disc}(\chi_A)$ is a polynomial in the entries of $A$, which can be expressed in
terms of the eigenvalues $\lambda_1, \dots, \lambda_n$ of $A$ as follows:

$$\mathrm{disc}(\chi_A) = \prod_{i<j} (\lambda_i - \lambda_j)^2.$$

Note that $\alpha\lambda_1, \dots, \alpha\lambda_n$ are the eigenvalues of $\alpha A$, for $\alpha \in \mathbb{C}$. Hence

$$\mathrm{disc}(\chi_{\alpha A}) = \prod_{i<j} (\alpha\lambda_i - \alpha\lambda_j)^2 = \alpha^{n^2-n} \prod_{i<j} (\lambda_i - \lambda_j)^2.$$

We conclude that $\mathrm{disc}(\chi_A)$ is homogeneous of degree $n^2 - n$ in the entries of $A$.
We now apply Corollary 1.2 with $p = n^2 - 1$ and $d = n^2 - n$ to get (use (13))

$$\mathop{\mathbf{E}}_{A \in B(M,\sigma)} \bigl(\ln \kappa_{\mathrm{eigen}}(A)\bigr) \le 8\ln n + 2\ln\frac{1}{\sigma} + 4 + \frac{1}{2}\ln 2. \qquad \Box$$
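
The homogeneity of the discriminant used in the proof can be checked by hand for $n = 3$, where $n^2 - n = 6$. The sketch below is only an illustration under our own naming: it computes $\chi_A$ for an integer $3\times 3$ matrix, evaluates the classical discriminant of a monic cubic, and compares $\mathrm{disc}(\chi_{\alpha A})$ with $\alpha^6\,\mathrm{disc}(\chi_A)$ in exact integer arithmetic.

```python
def char_poly_3x3(A):
    """Coefficients (b, c, d) of chi_A(x) = x^3 + b x^2 + c x + d."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    trace = a11 + a22 + a33
    # Sum of the three principal 2x2 minors.
    minors = ((a11 * a22 - a12 * a21) + (a11 * a33 - a13 * a31)
              + (a22 * a33 - a23 * a32))
    det = (a11 * (a22 * a33 - a23 * a32)
           - a12 * (a21 * a33 - a23 * a31)
           + a13 * (a21 * a32 - a22 * a31))
    return (-trace, minors, -det)

def disc_monic_cubic(b, c, d):
    """Discriminant of x^3 + b x^2 + c x + d."""
    return b * b * c * c - 4 * c ** 3 - 4 * b ** 3 * d - 27 * d * d + 18 * b * c * d

def disc_char_poly(A):
    return disc_monic_cubic(*char_poly_3x3(A))

def scale(A, alpha):
    return [[alpha * entry for entry in row] for row in A]

# Example: the two sides of the homogeneity identity agree exactly.
A = [[1, 2, 0], [0, 3, 4], [5, 0, 6]]
alpha = 2
lhs = disc_char_poly(scale(A, alpha))
rhs = alpha ** 6 * disc_char_poly(A)
```

The same functions also confirm that the discriminant vanishes exactly on matrices with a repeated eigenvalue, for instance on the diagonal matrix with entries 1, 1, 2.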

3.4 Complex polynomial systems 

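Let $d_1, \dots, d_n \in \mathbb{N} \setminus \{0\}$. We denote by $\mathcal{H}_{\mathbf{d}}$ the vector space of polynomial systems
$f = (f_1, \dots, f_n)$ with $f_i \in \mathbb{C}[X_0, \dots, X_n]$ homogeneous of degree $d_i$, $i = 1, \dots, n$.
For $f, g \in \mathcal{H}_{\mathbf{d}}$ we write

$$f_i(X) = \sum_{\alpha} a^i_\alpha X^\alpha, \qquad g_i(X) = \sum_{\alpha} b^i_\alpha X^\alpha,$$

where $\alpha = (\alpha_0, \dots, \alpha_n)$ is assumed to range over all multi-indices such that $|\alpha| =
\sum_{k=0}^{n} \alpha_k = d_i$ and $X^\alpha := X_0^{\alpha_0} X_1^{\alpha_1} \cdots X_n^{\alpha_n}$.

The space $\mathcal{H}_{\mathbf{d}}$ is endowed with a Hermitian inner product $\langle f, g\rangle = \sum_{i=1}^{n} \langle f_i, g_i\rangle$,
where

$$\langle f_i, g_i\rangle = \sum_{|\alpha| = d_i} a^i_\alpha\, \overline{b^i_\alpha}\, \binom{d_i}{\alpha}^{-1}.$$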

Here, the bar denotes complex conjugate and the multinomial coefficients are defined
by:

$$\binom{d_i}{\alpha} = \frac{d_i!}{\alpha_0!\,\alpha_1! \cdots \alpha_n!}.$$



Note that choosing this Hermitian product amounts to choosing the monomials
$\binom{d_i}{\alpha}^{1/2} X^\alpha$ as orthonormal basis of $\mathcal{H}_{\mathbf{d}}$.
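
As a small illustration of this inner product, the sketch below evaluates it for a single homogeneous polynomial stored as a dictionary from exponent tuples $\alpha$ to coefficients. The representation and the function names are our own assumptions, chosen only to make the weights $\binom{d_i}{\alpha}^{-1}$ explicit.

```python
from math import factorial, sqrt

def multinomial(alpha):
    """d! / (alpha_0! ... alpha_n!) with d = |alpha|."""
    value = factorial(sum(alpha))
    for a in alpha:
        value //= factorial(a)  # every partial quotient is an integer
    return value

def weyl_inner(f, g):
    """Weyl product of two homogeneous polynomials of the same degree.

    f, g: dicts mapping exponent tuples alpha to (complex) coefficients.
    """
    return sum(
        complex(f[alpha]) * complex(g[alpha]).conjugate() / multinomial(alpha)
        for alpha in f.keys() & g.keys()
    )

# X0*X1 has squared Weyl norm 1/2, so sqrt(2)*X0*X1 is a unit vector.
mixed = {(1, 1): 1}
unit_mixed = {(1, 1): sqrt(2)}
square = {(2, 0): 1}
```

Distinct monomials are orthogonal under this product, and rescaling each monomial by the square root of its multinomial coefficient yields a unit vector, in line with the remark above.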

In the case of one variable, this product was introduced by H. Weyl [10]. Its use
in computational mathematics goes back at least to Kostlan ^]. Throughout this
section, let $\|f\|$ denote the corresponding norm of $f$. As described in H2,l( the Weyl
product defines a Riemannian structure on the corresponding space $\mathbb{P}(\mathcal{H}_{\mathbf{d}})$, with its
associated projective distance $d_{\mathbb{P}(\mathcal{H}_{\mathbf{d}})}$.
In a seminal series of papers, M. Shub and S. Smale US HE1 HZ] studied
the problem of, given $f \in \mathcal{H}_{\mathbf{d}}$, computing (an approximation of) a zero of $f$. They
proposed an algorithm and studied its complexity in terms of, among other parameters,
a condition number $\mu_{\mathrm{norm}}(f)$ for $f$. We recall its definition (see [3, Chapter 12]
for details). For a simple zero $\zeta \in \mathbb{P}^n$ of $f \in \mathcal{H}_{\mathbf{d}}$ one defines



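$$\mu_{\mathrm{norm}}(f, \zeta) := \|f\| \, \Bigl\| \bigl(Df(\zeta)|_{T_\zeta}\bigr)^{-1} \mathrm{diag}\bigl(\sqrt{d_1}\,\|\zeta\|^{d_1-1}, \dots, \sqrt{d_n}\,\|\zeta\|^{d_n-1}\bigr) \Bigr\|,$$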



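where $Df(\zeta)|_{T_\zeta}$ denotes the restriction of the derivative of $f\colon \mathbb{C}^{n+1} \to \mathbb{C}^n$ at $\zeta$ to the
tangent space $T_\zeta\mathbb{P}^n = \{v \in \mathbb{C}^{n+1} \mid \langle v, \zeta\rangle = 0\}$ of $\mathbb{P}^n$ at $\zeta$. Note that $\mu_{\mathrm{norm}}(f, \zeta)$ is
homogeneous of degree $0$ in $f$ and $\zeta$. If $f$ has only simple zeros $\zeta_1, \dots, \zeta_q$ we define

$$\mu_{\mathrm{norm}}(f) = \max_{i \le q} \mu_{\mathrm{norm}}(f, \zeta_i);$$

otherwise we set $\mu_{\mathrm{norm}}(f) = \infty$. The study of $\mu_{\mathrm{norm}}(f)$ plays a central role in the
series of papers above. A main result is the following [25] (see also [3, Theorem 1,
Chapter 13]).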

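Theorem 3.5 Let $n > 1$. The probability that $\mu_{\mathrm{norm}}(f) > 1/\varepsilon$ for $f \in \mathbb{P}(\mathcal{H}_{\mathbf{d}})$ and
$\varepsilon > 0$ is less than or equal to

$$\varepsilon^4 n^3 (n + 1) N (N - 1) \mathcal{D},$$

where $\dim \mathcal{H}_{\mathbf{d}} = N + 1$ and $\mathcal{D} = \prod_{i=1}^{n} d_i$ is the Bézout number.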
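
To see the size of the quantity in Theorem 3.5, the following sketch evaluates $\varepsilon^4 n^3 (n+1) N (N-1) \mathcal{D}$ for a given degree pattern. It assumes only the standard fact that the homogeneous polynomials of degree $d_i$ in $n+1$ variables form a space of dimension $\binom{n+d_i}{n}$; the function names are hypothetical.

```python
from math import comb, prod

def dimension_N(degrees):
    """N such that dim H_d = N + 1, for degrees (d_1, ..., d_n)."""
    n = len(degrees)
    return sum(comb(n + d, n) for d in degrees) - 1

def tail_bound(degrees, eps):
    """Value of eps^4 n^3 (n + 1) N (N - 1) D from Theorem 3.5."""
    n = len(degrees)
    N = dimension_N(degrees)
    bezout = prod(degrees)  # Bezout number D = d_1 * ... * d_n
    return eps ** 4 * n ** 3 * (n + 1) * N * (N - 1) * bezout
```

For two quadrics in three homogeneous variables, that is `degrees = (2, 2)`, one gets $N = 11$ and the bound equals $10560\,\varepsilon^4$, so it only becomes informative (smaller than 1) once $\varepsilon$ is somewhat below $0.1$.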

We want to extend Theorem 3.5 to a smoothed analysis of $\mu_{\mathrm{norm}}(f)$. To do
so, we first bound $\mu_{\mathrm{norm}}(f)$ by a conic condition number. Let $\Sigma \subset \mathbb{P}(\mathcal{H}_{\mathbf{d}})$ be the
discriminant variety, which consists of the systems $f \in \mathbb{P}(\mathcal{H}_{\mathbf{d}})$ having multiple zeros.
The Condition Number Theorem [3, §12.4] states that, for a zero $\zeta \in \mathbb{P}^n(\mathbb{C})$ of $f$,

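$$\mu_{\mathrm{norm}}(f, \zeta) = \frac{1}{d_{\mathbb{P}(\mathcal{H}_{\mathbf{d}})}(f, \Sigma \cap V_\zeta)},$$

where $V_\zeta := \{f \in \mathbb{P}(\mathcal{H}_{\mathbf{d}}) \mid f(\zeta) = 0\}$. Therefore,

$$\mu_{\mathrm{norm}}(f) = \max_{i \le q} \mu_{\mathrm{norm}}(f, \zeta_i) = \frac{1}{\min_{i \le q} d_{\mathbb{P}(\mathcal{H}_{\mathbf{d}})}(f, \Sigma \cap V_{\zeta_i})} \le \frac{1}{d_{\mathbb{P}(\mathcal{H}_{\mathbf{d}})}(f, \Sigma)}.$$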

We can now proceed with the desired extension. 
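
Before doing so, it may help to see the definition of $\mu_{\mathrm{norm}}(f, \zeta)$ at work in the simplest case $n = 1$, where $f$ is a single binary form and the tangent space $T_\zeta$ is a line. The sketch below is restricted to that case and uses our own conventions: `coeffs[k]` is the coefficient of $X_0^{d-k} X_1^k$, and the inverse of the restricted derivative has norm $1/|Df(\zeta)v|$ for a unit vector $v$ spanning $T_\zeta$.

```python
from math import comb, sqrt

def weyl_norm(coeffs):
    """Weyl norm of f = sum_k coeffs[k] * X0^(d-k) * X1^k."""
    d = len(coeffs) - 1
    return sqrt(sum(abs(c) ** 2 / comb(d, k) for k, c in enumerate(coeffs)))

def mu_norm(coeffs, zeta):
    """mu_norm(f, zeta) for a binary form f (n = 1) and a zero zeta of f."""
    d = len(coeffs) - 1
    z0, z1 = complex(zeta[0]), complex(zeta[1])
    # Partial derivatives of f at zeta.
    d0 = sum(c * (d - k) * z0 ** (d - k - 1) * z1 ** k
             for k, c in enumerate(coeffs) if k < d)
    d1 = sum(c * k * z0 ** (d - k) * z1 ** (k - 1)
             for k, c in enumerate(coeffs) if k > 0)
    nz = sqrt(abs(z0) ** 2 + abs(z1) ** 2)
    # Unit vector spanning T_zeta = {v : <v, zeta> = 0}.
    v = (-z1.conjugate() / nz, z0.conjugate() / nz)
    dfv = d0 * v[0] + d1 * v[1]
    return weyl_norm(coeffs) * sqrt(d) * nz ** (d - 1) / abs(dfv)
```

For $f = X_1^2 - X_0^2$ and the zero $\zeta = (1, 1)$ the sketch returns 1, and the value is unchanged when $f$ or $\zeta$ is rescaled, in agreement with homogeneity of degree 0. For $f = X_1^2 - \varepsilon^2 X_0^2$ the two zeros $(1, \pm\varepsilon)$ approach each other as $\varepsilon \to 0$ and the returned value grows like $1/(\sqrt{2}\,\varepsilon)$.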

We identify the $f_i$ with their coefficient vectors in $\mathbb{C}^{N_i}$, where $N_i = \binom{n+d_i}{n}$. Set
$N = \sum_{i=1}^{n} N_i - 1$ so that $\Sigma \subset \mathbb{P}^N$. Our next result bounds the degree of $\Sigma$. Similar
bounds were given in |2Ul Proposition 6.1]. 
Lemma 3.6 The discriminant variety $\Sigma \subset \mathbb{P}^N$ is a hypersurface, defined by a
multihomogeneous polynomial of total degree

$$\Bigl(1 + \Bigl(1 - n + \sum_{i=1}^{n} d_i\Bigr) \sum_{i=1}^{n} \frac{1}{d_i}\Bigr) \mathcal{D} \le 2n\mathcal{D}^2$$

in the coefficients of $f_1, \dots, f_n$.

Proof. Given $n + 1$ homogeneous polynomials $f_0, \dots, f_n$ in $\mathbb{C}[X_0, X_1, \dots, X_n]$ of
degrees $d_i$, it is known (see [35, §4.2]) that there exists an irreducible polynomial
$\mathrm{res}(f_0, \dots, f_n)$ in the coefficients of the $f_i$ (unique up to a scalar) such that
$\mathrm{res}(f_0, \dots, f_n) = 0$ if and only if the system $f_0 = \cdots = f_n = 0$ has a projective
solution. (The polynomial $\mathrm{res}$ is called the multivariate resultant.) Moreover, $\mathrm{res}$ is
multihomogeneous of degree $\prod_{j \ne i} d_j$ in the coefficients of each $f_i$.
Now define

$$\delta(f_1, \dots, f_n) := \mathrm{res}(g, f_1, \dots, f_n),$$

where $g := \det(df_1, \dots, df_n, \sum_i X_i\, dX_i)$. A solution $\zeta$ to the system $f = 0$ is
degenerate if and only if the $df_i(\zeta)$ are linearly dependent, which is the case if and
only if $g(\zeta) = 0$ (here we used Euler's identity, stating that for homogeneous $f_i$ and
all $x \in \mathbb{C}^{n+1}$, $df_i(x)$ is orthogonal to $x$). It follows that $\delta(f_1, \dots, f_n)$ defines the
discriminant variety $\Sigma$.

For the degree calculations, note first that $\deg g = 1 + \sum_{i=1}^{n} (d_i - 1) = 1 - n +
\sum_{i=1}^{n} d_i$. We thus obtain

$$\deg \delta(f_1, \dots, f_n) = \mathcal{D} + \deg g \sum_{i=1}^{n} \frac{\mathcal{D}}{d_i} = \mathcal{D}\Bigl(1 + \deg g \sum_{i=1}^{n} \frac{1}{d_i}\Bigr)$$

as claimed. This degree can be (rather crudely) estimated by $2n\mathcal{D}^2$. □
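
The degree formula of Lemma 3.6 and its crude estimate are easy to tabulate. The sketch below, with a hypothetical function name, evaluates $\mathcal{D} + \deg g \sum_i \mathcal{D}/d_i$ in exact integer arithmetic so that it can be compared with $2n\mathcal{D}^2$ for small degree patterns.

```python
from math import prod

def discriminant_degree(degrees):
    """Total degree D + deg(g) * sum_i D / d_i from Lemma 3.6."""
    n = len(degrees)
    bezout = prod(degrees)            # Bezout number D
    deg_g = 1 - n + sum(degrees)      # deg g = 1 + sum_i (d_i - 1)
    # Each d_i divides D, so the integer division below is exact.
    return bezout + deg_g * sum(bezout // d for d in degrees)

def crude_bound(degrees):
    """The estimate 2 n D^2."""
    return 2 * len(degrees) * prod(degrees) ** 2
```

For two quadrics, `degrees = (2, 2)`, the formula gives degree 16 against the estimate 64. In general the estimate follows from $\deg g \le \mathcal{D}$ and $\sum_i \mathcal{D}/d_i \le n\mathcal{D}$.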



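Theorem 3.7 For all $f \in \mathbb{P}(\mathcal{H}_{\mathbf{d}})$, all $\sigma \in (0, 1]$, and all $t > N\sqrt{2}$ we have

$$\mathop{\mathrm{Prob}}_{g \in B(f,\sigma)} \{\mu_{\mathrm{norm}}(g) > t\} \le 4N^3 e^3 n \mathcal{D}^2 \Bigl(\frac{1}{t\sigma}\Bigr)^2 \Bigl(1 + N\frac{1}{t\sigma}\Bigr)^{2(N-1)}$$

and

$$\mathop{\mathbf{E}}_{g \in B(f,\sigma)} \bigl(\ln \mu_{\mathrm{norm}}(g)\bigr) \le \frac{7}{2}\ln N + \ln \mathcal{D} + \frac{1}{2}\ln n + 5 + 2\ln\frac{1}{\sigma}.$$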

Proof. It follows from Corollary 1.2 and Lemma 3.6. □